Understanding Sampling-based Adversarial Search Methods

Author

  • Ashish Sabharwal
Abstract

Until 2007, the best computer programs for playing the board game Go performed at the level of a weak amateur, while employing the same Minimax algorithm that had proven so successful in other games such as Chess and Checkers. Thanks to a revolutionary new sampling-based planning approach named Upper Confidence bounds applied to Trees (UCT), today's best Go programs play at a master level on full-sized 19 × 19 boards. Intriguingly, UCT's spectacular success in Go has not been replicated in domains that have been the traditional stronghold of Minimax-style approaches. The focus of this thesis is on understanding this phenomenon. We begin with a thorough examination of the various facets of UCT in the games of Chess and Mancala, where we can contrast the behavior of UCT with that of the better-understood Minimax approach. We then introduce the notion of shallow search traps (positions in games where short winning strategies for the opposing player exist) and demonstrate that these are distributed very differently in different games, and that this has a significant impact on the performance of UCT. Finally, we study UCT and Minimax in two novel synthetic game settings that permit mathematical analysis. We show that UCT is relatively robust to misleading heuristic feedback if the noise samples are independently drawn, whereas systematic biases in a heuristic can cause UCT to prematurely "freeze" onto sub-optimal lines of play and thus perform poorly. We conclude with a discussion of potential avenues for future work.

Raghuram Ramanujan is a PhD candidate in Computer Science at Cornell University, where he has collaborated with Bart Selman and Ashish Sabharwal on research problems related to algorithms for computer game playing. Prior to Cornell, he was an undergraduate at Purdue University, where he earned a B.S. in Computer Engineering, with a minor in Economics. His interest in Artificial Intelligence was stoked by his undergraduate research work in machine learning and planning, which was carried out under the supervision of Robert Givan and Alan Fern. In the distant past, he spent a couple of years in Singapore completing his GCE 'A' Levels as an SIA Youth Scholar. Outside of academia, he is an accomplished nature and wildlife photographer whose work has been featured on the website of the Cornell Lab of Ornithology and in a field guide to birding in the Finger Lakes region.


Similar resources

Understanding Sampling Style Adversarial Search Methods

UCT has recently emerged as an exciting new adversarial reasoning technique based on cleverly balancing exploration and exploitation in a Monte-Carlo sampling setting. It has been particularly successful in the game of Go but the reasons for its success are not well understood and attempts to replicate its success in other domains such as Chess have failed. We provide an in-depth analysis of th...


On the Behavior of UCT in Synthetic Search Spaces

UCT and Minimax are two of the most prominent tree-search based adversarial reasoning strategies for a variety of challenging domains, such as Chess and Go. Their complementary strengths in different domains have been the motivation for several works attempting to achieve a better understanding of their vastly different behavior. Rather than using complex games as a testbed for deriving indirec...


On Adversarial Search Spaces and Sampling-Based Planning

Upper Confidence bounds applied to Trees (UCT), a bandit-based Monte-Carlo sampling algorithm for planning, has recently been the subject of great interest in adversarial reasoning. UCT has been shown to outperform traditional minimax-based approaches in several challenging domains such as Go and Kriegspiel, although minimax search still prevails in other domains such as Chess. This work provide...


Computation and Decision-Making in Large Extensive Form Games

In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MC-CFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a...


Sparse Sampling for Adversarial Games

This paper introduces Monte Carlo *-Minimax Search (MCMS), a Monte-Carlo search algorithm for finite, turn-based, stochastic, two-player, zero-sum games of perfect information. Through a combination of sparse sampling and classical pruning techniques, MCMS allows deep plans to be constructed. Unlike other popular tree search techniques, MCMS is suitable for densely stochastic games, i.e., gam...


Adversarial Texts with Gradient Methods

Adversarial samples for images have been extensively studied in the literature. Among the many attack methods, gradient-based methods are both effective and easy to compute. In this work, we propose a framework to adapt gradient-based attack methods for images to the text domain. The main difficulties for generating adversarial texts with gradient methods are: (i) the input space is discrete,...




Publication date: 2012